Programming Research Group PRACTICAL BARRIER SYNCHRONISATION
Authors
Abstract
We investigate the performance of barrier synchronisation on both shared-memory and distributed-memory architectures, using a wide range of techniques. The performance results obtained show that distributed-memory architectures behave predictably, although their performance for barrier synchronisation is relatively poor. For shared-memory architectures, a much larger range of implementation techniques is available. We show that asymptotic analysis is useless, and that a detailed understanding of the underlying hardware is required to design an effective barrier implementation. We show that a technique using cache coherence is more effective than semaphore- or lock-based techniques, and is competitive with specialised barrier synchronisation hardware.
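As a rough illustration of what a flag-based shared-memory barrier can look like, the sketch below implements a centralised sense-reversing barrier: each arriving thread decrements a shared counter, and the last arrival releases the others with a single write to a shared flag that the waiters spin on. On a cache-coherent machine, waiters in a compiled implementation would typically spin on a locally cached copy of that flag until the releasing write invalidates it. This is an assumption about the general family of techniques, not the implementation measured in the paper; the names `SenseReversingBarrier` and `run_demo` are hypothetical, and Python threads only demonstrate the logic, not the hardware behaviour.

```python
import threading
import time


class SenseReversingBarrier:
    """Centralised sense-reversing barrier (illustrative sketch only)."""

    def __init__(self, parties):
        self.parties = parties
        self.count = parties            # arrivals still outstanding in this episode
        self.sense = False              # shared flag that all waiters spin on
        self.lock = threading.Lock()    # stands in for an atomic fetch-and-decrement
        self.local = threading.local()  # holds each thread's private sense

    def wait(self):
        # Each thread flips its private sense once per barrier episode.
        my_sense = not getattr(self.local, "sense", False)
        self.local.sense = my_sense
        with self.lock:
            self.count -= 1
            last = self.count == 0
        if last:
            # Reset the counter first, then release everyone with one write.
            self.count = self.parties
            self.sense = my_sense
        else:
            while self.sense != my_sense:
                time.sleep(0)  # yield so other Python threads can make progress


def run_demo(parties=4, rounds=20):
    """Run `rounds` barrier episodes and record any thread that got past early."""
    barrier = SenseReversingBarrier(parties)
    arrived = [0] * rounds
    guard = threading.Lock()
    violations = []

    def worker():
        for r in range(rounds):
            with guard:
                arrived[r] += 1
            barrier.wait()
            # After the barrier, every thread must have registered for round r.
            if arrived[r] != parties:
                violations.append(r)

    threads = [threading.Thread(target=worker) for _ in range(parties)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return arrived, violations


if __name__ == "__main__":
    print(run_demo())
```

Reversing the sense on each episode lets the same flag be reused without a separate reset phase, which is why a fast thread re-entering the barrier cannot be confused by the value left over from the previous episode.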
Similar resources
Barrier Synchronisation for occam-pi
This paper introduces a safe language binding for CSP multiway events (barriers) that has been built into occam-π (an extension of the classical occam language with dynamic parallelism, mobile processes and mobile channels). Barriers provide a simple way for synchronising multiple processes and are the fundamental control mechanism underlying both CSP (Communicating Sequential Processes) and BS...
Armus: dynamic deadlock verification for barriers
This paper presents a graph-based dynamic verification algorithm for deadlock detection and avoidance specialised in barrier synchronisation. Barriers are used to coordinate the execution of groups of tasks, and serve as a building block of parallel computing. The synchronisation patterns enabled by current barrier-based abstractions can introduce deadlocks, a major issue in getting parallel ap...
Synchronising Changes to Design and Implementation using a Declarative Meta-Programming Language
When developing software systems, the relation between design and implementation is typically left unspecified. As a result design or implementation can be modified independently of each other, and a modification of either one does not leave any trace in the other. The practical result of this is a number of well-known problems such as drift and erosion, documentation maintenance problems or ro...
Programming Research Group A SCHEME FOR THE BSP SCHEDULING OF GENERIC LOOP NESTS
This report presents a scheme for the bulk-synchronous parallel (BSP) scheduling of generic, untightly nested loops. Being targeted at the BSP model of computation, the novel parallelisation scheme yields parallel code which is scalable, portable, and whose cost can be accurately analysed. The scheme comprises three stages: data dependence analysis and potential parallelism identification, data ...